DOCUMENT RESUME

ED 022 565                                                    PS 001 258

By-Mostofsky, David
HEAD START EVALUATION AND RESEARCH CENTER, BOSTON UNIVERSITY. REPORT D-III, A STUDY
OF PREFERENCES AMONG QUALITATIVELY DIFFERING UNCERTAINTIES.
Boston Univ., Mass.
Spons Agency- Office of Economic Opportunity, Washington, D.C. 
Pub Date 67
Note-7p.
EDRS Price MF-$0.25 HC-$0.36
Descriptors-BEHAVIOR PATTERNS, CULTURALLY DISADVANTAGED, *MOTIVATION TECHNIQUES, PATTERNED
RESPONSES, *PRESCHOOL CHILDREN, *PROBABILITY, REINFORCEMENT, *RESPONSE MODE, *REWARDS
Identifiers-Head Start

The purpose of this study was to measure nonverbally the preference of 
alternative responses when the net probability of being rewarded was the same. A 
demonstration of preference under these circumstances would suggest the ability to
control or maintain behavior without explicit administration of a reinforcing agent. Head 
Start children were used as subjects. They were provided, in the experimental situation, 
with a two-button console. The right button, when pushed, resulted in the illumination of 
a yellow light and the dispensing of a penny for every second illumination (a consistent 
reward schedule). The pushing of the left button would result in a 50 percent chance 
of the illumination of a red light, which was never followed by reward, and a 50 percent 
chance of the illumination of a green light, which was always rewarded. Thus, whichever 
button was pushed, there followed a net 50 percent chance of reward. However, only 
the right button provided a consistent 50 percent reward. The results indicate that 
children prefer a consistent reward situation to a reward uncertainty situation. (WD) 
ABSTRACT



Head Start children participated in an experiment in which rewards were made
available. Regardless of the child's position response (right or left), the probability
of reward was always p = .5. Discriminative stimuli were made available;
one side imperfectly correlated with the subsequent availability of reward, the other
perfectly correlated. Preference for the "consistent side" was evidenced. Implications
for application as a non-verbal diagnostic and training model are discussed.



* "The research reported herein was performed pursuant to a contract with the Office 
of Economic Opportunity, Executive Office of the President, Washington, D.C. 20506.

The opinions expressed herein are those of the author and should not be construed 
as representing the opinions or policy of any agency of the United States Government." 
HEAD START EVALUATION AND RESEARCH CENTER



A STUDY OF PREFERENCES AMONG QUALITATIVELY DIFFERING UNCERTAINTIES 1 

David Mostofsky 
Boston University 



Problems in thinking and concept formation have had a long history in the
literature of psychology. It is only relatively recently, however, that the focus
of interest has turned to an adaptation of refined techniques in response to
questions of concern such as decision processes in children. That such processes
vary as a function of stimulus as well as cultural conditions has been observed by
many. The Educational Testing Service has, for one, had more than a casual interest
in the nature of such choice responding. A series of papers, represented by the work
of Rosenhan (1966a), is evidence of such studies. The Rosenhan type of study has not
contented itself with the examination of alternation behavior in the neutral
laboratory, but has been equally concerned with the effects of social class and race
on responsiveness to reinforcement (Rosenhan 1966b). The purpose of the present
study is also directed to a specific activity of choice and preference. The particular
interest in this study is to utilize a non-verbal task which can give evidence of
preferences exhibited under conditions of equal satisfaction (reward). Such a
demonstration would suggest the ability to control or maintain behavior without explicit
administration of a reinforcing agent. Historically, this problem might be thought of
as beginning with a master's dissertation by Prokasy (1956), extended more recently
by Bowdt et al. (1966). The model inherent in these investigations may also be seen in
a series of studies pursued by Weir (1964, 1965) in which children's preferences were
observed independent of reward consequences.

A number of conditions strongly argue for investigating alternative behavioral
techniques which might be of general value both for diagnostic and remedial purposes.
Principally, such techniques would place minimal emphasis or requirement on the use
of verbal behavior. To the extent that debilitations of either a cultural or
intellectual nature affect the learning or other performance capabilities of children and
adults, the wisdom of using any verbally based investigatory or measurement scheme
seems highly questionable. A host of theoretical issues surround this problem of
non-verbal alternatives, not the least of which concerns a position formulated most
explicitly by B. F. Skinner, viz. that the most complicated classes of human and
infrahuman behavior are subject to the same fundamental laws of control. In dealing with
the problems and objectives of learning in children and adults, the advent of programmed
instruction and teaching machines has tended to demonstrate the efficacy of
reinforcement and its associated schedule of dispensation. The data tend to convey an
expectancy of success in the modification or maintenance of verbal and other behaviors by
recourse to known and existing reinforcement schemes. These procedures and schemes
usually differ with respect to parameters such as the magnitude, frequency, and rate
of reinforcement availability. Staats and Staats (1963), in their work "Complex Human
Behavior," discuss a number of variables which have successfully been explored in
conjunction with the study and control of human behavior.



1 "The research reported herein was performed pursuant to a contract with the Office
of Economic Opportunity, Executive Office of the President, Washington, D.C. 20506.
The opinions expressed herein are those of the author and should not be construed
as representing the opinions or policy of any agency of the United States
Government."

Outside the laboratory situation, however, the realities of utilizing primary
reinforcement would seem constraining features for large-scale implementation.
Instrumentation problems aside, certain sub-classes of subjects, who might otherwise
benefit most from the favorable features of these procedures, may well suffer
from unanticipated problems, such as the utility of the reinforcement, i.e., the
subjective utility of the reinforcement as perceived by the subject. Such a
hypothesis was advocated by Rosenhan in a series of studies executed under the aegis of
ETS. These studies suggested the instability of reinforcement utilities, and that
the perceived value of the reinforcement differs as a function of racial group
membership. It would appear desirable, therefore, (1) to capitalize on the desirable
qualities of the reinforcement and operant approach, (2) to incorporate the technology
of the experimental analysis of behavior in investigating not only learning behaviors
but also the less articulable qualities which surround the learning situation, and (3)
to bypass the liabilities of utility, satiation, etc. which accompany the use of
primary reinforcement. The present study represents a modest attempt to satisfy
these criteria.

The theme of this study may be simply described as measuring non-verbally the
preference among alternatives whose net probability of payoff is the same. Weir (1964)
had investigated a situation where children were able to manipulate one of two plungers,
left or right. The left plunger was programmed to dispense reinforcement
alternately, i.e., 50% of the time (every other response), while the right plunger was
also programmed to dispense reinforcement 50% of the time, but on a random basis.
In general, the left plunger was preferred. Subsequent extensions of this design
enabled Weir to program the right plunger for values other than 50%. Weir interpreted
his experiment as demonstrating preferences under the conditions of consistency
(alternate) versus non-consistency (random), since children indicated a preference
for the "consistent" side. There are several difficulties in this kind of
interpretation. (1) To identify the preference as one between consistency and
non-consistency does not offer any additional explanatory value. The terminology or
concept of consistency as used essentially describes the technique alone. While it
may be a convenient label for the experimenter to differentiate the plungers with
respect to the rules by which reinforcement is made available, it has no other
embellishing or connotative properties. It certainly contributes nothing to an account
of the subject's behavior. It only accounts for the behavior of the machinery. (2)
The Weir experiment, when looked at closely, is nothing more than a comparison of
components in a concurrent schedule. The alternating schedule (in which every other
trial pays off) is a fixed ratio of 2 (FR2), whereas the random schedule is a variant
of the VR or VI schedule. Weir's data, therefore, are consistent with the comparisons
of data from incompatible concurrent schedules as described by Catania (1966). They do
not, however, demonstrate control by any feature other than the schedule.
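
The point that the two plungers differ only in schedule, not in overall payoff, may be illustrated by a short simulation. What follows is a minimal sketch and not Weir's procedure: it assumes that the random side pays off independently on each response with probability .5, and all function names are hypothetical.

```python
import random


def alternating_schedule(n_responses):
    """FR 2: every second response is reinforced."""
    return [(i % 2 == 1) for i in range(n_responses)]


def random_schedule(n_responses, p=0.5, seed=0):
    """Each response is reinforced independently with probability p (an assumption)."""
    rng = random.Random(seed)
    return [rng.random() < p for _ in range(n_responses)]


def payoff_rate(outcomes):
    """Proportion of responses that were reinforced."""
    return sum(outcomes) / len(outcomes)


def longest_dry_run(outcomes):
    """Longest run of consecutive unreinforced responses."""
    longest = current = 0
    for reinforced in outcomes:
        current = 0 if reinforced else current + 1
        longest = max(longest, current)
    return longest


if __name__ == "__main__":
    alt = alternating_schedule(1000)
    rnd = random_schedule(1000)
    print("alternating:", payoff_rate(alt), longest_dry_run(alt))
    print("random:     ", payoff_rate(rnd), longest_dry_run(rnd))
```

Under these assumptions both sides pay off about half the time, but only the random side produces long runs without reinforcement. The difference between the plungers is therefore a property of the schedule itself, which is the difficulty raised in point (2).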



A different paradigm was, therefore, adopted in the present study. The approach
derives from a master's thesis by Prokasy, who used rats in a T maze situation. The
legs of the T were varied such that the right leg, painted striated, always led to a
delay chamber which was also striated. Following a delay period in the striated chamber,
the animal was released to the end of the arm, in which reinforcement was either made
available or not made available, with a probability of .5. If the animal chose the
left arm of the T maze, he entered a delay chamber which was either striated or solid
color. If it was striated, release from the chamber was constantly followed by
reinforcement. If it was not striated, release was never followed by reinforcement,
i.e., CRF versus extinction. Again, the probability of CRF or extinction was .5.
The net expected probability of payoff to either side of the T maze was, therefore,
the same. The animal's preference for one arm over the other would reflect his
preference for a payoff consistently associated with some discriminative property
of the delay chamber rather than a payoff not consistently associated with some
discriminative property of the delay chamber. Prokasy found a preference for the
"consistent side." His interpretation relies heavily on the hypothesized
optimization of anticipatory salivating behavior which the animal undertakes in the delay
chamber and which facilitates the terminal consummatory response. Such a preference,
if found with humans, would not lend itself to the anticipatory salivating explanation,
although it very well might lend itself to an explanation of added reinforcement in
the form of conditioned reinforcement provided by the association of the discriminative
stimulus with the payoff. It is this model which has been incorporated in the present
study. The data would enable a general statement of preference (the control of
performance) between responses of different topologies having equal objective
probabilities of payoff by subjects for whom learning and cultural deprivations may differ.
This differs from the Weir experiment in that the reinforcement schedule is identical
in both response modes. Consistency may thus be meaningfully ascribed to the
situation in relation to the predictability of reward subsequent to the onset of a
discriminative stimulus.
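
The arithmetic behind the Prokasy arrangement can be worked through directly. The sketch below takes the probabilities from the description above and computes the net probability of payoff for each arm. As one possible way of expressing "predictability," it also computes the uncertainty about reward (in bits) that remains once the delay-chamber stimulus has been seen. The use of an entropy measure is an assumption made for illustration, not a measure used in the study, and the names are hypothetical.

```python
from math import log2

# Each arm maps a delay-chamber stimulus to a pair:
# (probability the stimulus occurs, probability of reward given that stimulus).
# Values follow the description of the T maze in the text.
ARMS = {
    "right": {"striated": (1.0, 0.5)},
    "left": {"striated": (0.5, 1.0), "solid": (0.5, 0.0)},
}


def net_reward_probability(arm):
    """Law of total probability: sum of P(stimulus) * P(reward | stimulus)."""
    return sum(p_stim * p_reward for p_stim, p_reward in arm.values())


def binary_entropy(p):
    """Uncertainty in bits of a yes/no event with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))


def remaining_uncertainty(arm):
    """Expected uncertainty about reward after the stimulus is seen (illustrative)."""
    return sum(p_stim * binary_entropy(p_reward) for p_stim, p_reward in arm.values())


if __name__ == "__main__":
    for name, arm in ARMS.items():
        print(name, net_reward_probability(arm), remaining_uncertainty(arm))
```

Both arms yield a net probability of .5. On the right arm the stimulus leaves the outcome fully uncertain (1 bit); on the left arm the stimulus removes the uncertainty entirely (0 bits). This is the sense in which consistency is ascribed to the stimulus-reward relation rather than to the schedule.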

Procedure 

Subjects were brought to a room (part of a two-room suite) in which
one wall was a one-way observation mirror. The subject's room housed a response
console (see figure 1). The subject's working area on the console contained two
buttons (Microswitch #2C206) which, when depressed, actuated a double-pole switch. The
buttons were 5" apart. Each unit could be illuminated independent of either the
subject's responses or the switching function. Facing the subject was a screen
which deflected pennies dispensed from a Gerbrand feeder mounted in the
rear of the console. Above the feeder shield was mounted a resettable six-digit
counter. Control equipment was housed in the adjacent experimenter's room. The
control equipment was programmed such that continuous operation of the right button
caused it to become illuminated yellow on the average of every 20 seconds. Once a
switch became illuminated, all other functions on the console remained inoperative
until that same lit key was depressed again. The subject, therefore, had to respond
to the yellow button (i.e., the illuminated manipulandum) for any functional change
to take place for him. Responding to the yellow button led to a dispensing of
reinforcement on a random average of 50% of the time. The subject's reinforcement
consisted either of an increase in the counter, a dispensing of a penny, or both.
The button light would then go off and the session continued. In the case of children,
the penny was used not for its own utility but rather served as a token.
Prior to the session the subject was invited to a room containing a toy bin from which
he selected a toy of his choice. He was then told he would be playing a game in
which he might earn pennies, and that each penny was to be placed in a bank, which was a
transparent jar. When the jar was filled, the toy of the child's choice would
be given to him. Both the toy and the jar were constantly in the subject's view
during the course of the experiment (procedure of Staats and Staats).

With respect to the right button, therefore, on the average it would become
lit every 20 seconds, with the probability of reinforcement following yellow equal
to .5. Continual operation of the left button would occasionally (VI of 20) also
cause it to turn color. However, 50% of the time it would turn red and 50% of the
time green. The consequence of operating the left button once it was lit was
perfectly correlated with its color. If the button was red, its operation would
simply lead to the termination of the color, and no reinforcement was made available.
If the button was green, its operation would lead to a dispensing of reinforcement
followed by termination of the color, and the session continued. All responses and
latencies were recorded. Response rate was recorded on standard cumulative recorders.

The subjects for the experiment were recruited from an available Head Start center
in operation in Revere, Massachusetts. They were transported in groups of three from
Revere to the experimental chambers at our laboratory and were individually tested
for the duration of the experiment. Other subjects were selected for the experiment
from among those participating in on-campus programs. The subjects included adults
as well as children. All subjects were first tested for color blindness using the
Dvorine Pseudoisochromatic Charts.

The analyses of interest with respect to the data concern the preference for the
left red/green button over the yellow button, i.e., the consistent versus the
non-consistent side, with respect to a) total number of responses, b) the relative
frequency of responding, c) the rate of responding, and d) the latency of responses.
The experiment is continuing for several reasons: (1) the sample size precludes the
use of powerful tests and is, therefore, inadequate for any definitive conclusions;
(2) certain control conditions are yet to be run, among them extinction following
preference behavior and the use of yellow-blue, each with a 25% reinforcement,
compared to the red-green (as a counterbalance for the possible novelty effects).
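
The contingencies programmed on the two buttons can be summarized in a short simulation. This is a sketch under stated assumptions and not the control logic actually used: the spacing of the intervals is taken to be exponential with a mean of 20 seconds (the text specifies only the average), each press of a lit button is treated as an independent episode, and all names are hypothetical.

```python
import random
from collections import Counter

MEAN_INTERVAL_S = 20.0  # "on the average of every 20 seconds"


def next_arming_interval(rng):
    """Seconds until a button next lights; exponential spacing is an assumption."""
    return rng.expovariate(1.0 / MEAN_INTERVAL_S)


def press_lit_button(side, rng):
    """Return (color, rewarded) for one press of a lit button."""
    if side == "right":
        # Yellow is followed by reward half the time, unpredictably.
        return "yellow", rng.random() < 0.5
    if side == "left":
        # Red or green with equal chance; green always pays, red never does.
        color = "green" if rng.random() < 0.5 else "red"
        return color, color == "green"
    raise ValueError(f"unknown side: {side!r}")


def summarize(side, n_episodes=10_000, seed=1):
    """Estimate P(reward) overall and P(reward | color) for one button."""
    rng = random.Random(seed)
    shown, rewarded = Counter(), Counter()
    for _ in range(n_episodes):
        color, got_reward = press_lit_button(side, rng)
        shown[color] += 1
        rewarded[color] += got_reward
    overall = sum(rewarded.values()) / n_episodes
    by_color = {color: rewarded[color] / shown[color] for color in shown}
    return overall, by_color


if __name__ == "__main__":
    for side in ("right", "left"):
        print(side, summarize(side))
```

In such a run the overall proportion of rewarded presses is close to .5 on both buttons, while the proportion conditional on color is near .5 for yellow, 1 for green, and 0 for red.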

The results give evidence that a) children will prefer a "consistent" reward
situation to a reward uncertainty situation. This preference was exhibited to some
degree by each child tested. The data taken collectively are in agreement, i.e.,
response rate, total response output, and response latency. b) Adults, while
seeking to arrive at an optimal "strategy" for maximizing payoff, also behave under
the control of the uncertainties. c) The individual differences and
variability exhibited in the extent of preference can be minimized, and the
preference correspondingly accentuated, by introducing a change-over delay. We have
done this manually and plan in succeeding experiments to build it in as a feature
of our logic and control instrumentation.
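
Two of the preference measures named among the analyses, the relative frequency of responding and the rate of responding, reduce to simple ratios. The sketch below shows one way they might be computed from session counts. The function names are hypothetical and the numbers in the usage lines are made up for illustration only; they are not data from this study.

```python
def relative_preference(consistent_responses, inconsistent_responses):
    """Proportion of all responses made to the consistent (red/green) button."""
    total = consistent_responses + inconsistent_responses
    if total == 0:
        raise ValueError("no responses recorded")
    return consistent_responses / total


def response_rate(responses, session_seconds):
    """Responses per minute over a session of the given length."""
    if session_seconds <= 0:
        raise ValueError("session length must be positive")
    return 60.0 * responses / session_seconds


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(relative_preference(300, 200))  # 0.6
    print(response_rate(300, 600))        # 30.0
```

A relative preference of .5 would indicate indifference between the buttons; values above .5 indicate preference for the consistent side.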

These findings strongly argue for sober reconsideration of the effectively
controlling stimuli in an applied learning situation. While not deprecating the
force of reward per se, the data clearly imply that much learning can be efficiently
directed by the manipulation of other environmental features, some of which may be
considered to have acquired secondary or conditioned reinforcement properties. It
would seem imperative to attempt a translation of these findings for implementation
in the classroom. The need is even greater where mere increases in reward are
contraindicated. The experiments will be continued and will involve larger numbers of
experimental units for adults, normal children, and retardates. In addition, supportive
experimentation will be concurrently pursued, i.e.:

1) extinction following stabilized behavior (using existing schedule)
2) extinction following stabilized behavior (using change-over delay)
3) forced trials to each manipulandum condition
4) red-green vs. yellow-blue (instead of r-g vs. y-y)

The conduct of such laboratory investigations surely constitutes a necessary
prerequisite for effective large-scale adaptation and for increasing the
effectiveness of educational practices in the real and applied world.

The data analyzed thus far suggest that the predictions which would have been
expected on the basis of reinforcement theory are substantiated in these studies.
As such, their suggested relevance to the applied situation becomes quite substantial
if the reliability of these findings is confirmed. That is, they do suggest that
secondary reinforcement may be an applicable reinforcer for Head Start
programs. If this is the case, the failings which are encountered with primary
reinforcement situations might be avoided, and the procedure would not be sensitive
to individual differences with respect to utility, satiation, or other factors related
to racial and ethnic composition. Furthermore, the data strongly suggest the
advisability of paying closer heed to whatever reinforcement qualities attend the
consistency of reinforcement schedules. They further suggest that the reinforcing
advantages of discriminative stimuli should not be overlooked even when primary
reinforcement is used. Perhaps a behavioral analysis of "consistency" may yet be
profitable.